Back to Dashboard

Note 4 Extended ER Features

Lecturer: Jane YOU Note Author: Zhen Tong

After we have the basic knowledge of Entity-Relationship Model, this week, let’s go deeper into ER design tips.

First of all let’s take this example, and see why we need the first thing we are caring about Specialization.

Specialization

Untitled

Overlapping Specialization: In an overlapping specialization scenario, entities can belong to multiple specialized sub-groups simultaneously. Consider an entity set "Person" and two sub-groupings "Employee" and "Student." In this case, a person can be both an employee and a student at the same time, which is why it's considered overlapping specialization.

In this example, a person can be a student and an employee at the same time, as some individuals work part-time while pursuing their studies.

Disjoint Specialization:

Let’s assume in this problem the secretary can’t be an instructor(which is a common case because they only study management🥱). In a disjoint specialization scenario, entities can belong to only one specialized sub-group. Consider an entity set "Employee" and two sub-groupings "Instructor" and "Secretary." In this case, a person can be either an instructor or a secretary, but not both, which is why it's considered a disjoint specialization.

Okay, now let’s take over this problem. How do we do?

A very intuitive way is you create a separate database schema for each class (or entity) in the specialization hierarchy. This is the specialization via schemas also known as the "Schema per Class Hierarchy" approach. Here's how it works and its drawbacks:

1. Form a Schema for the Higher-Level Entity:

2. Form a Schema for Each Lower-Level Entity Set:

Drawback: The main drawback of this approach is that it can lead to complex queries and potential performance issues when retrieving information about a specific entity. Here's why:

However, this approach offers the advantage of maintaining data integrity and ensuring that each entity type has its own specific attributes. It's suitable when there are substantial differences between entities in terms of attributes.

Another way is to represent specialization as schemas is known as the "Single Table Inheritance" approach. In this approach, you create a single database schema for each entity in the specialization hierarchy. Each schema includes all the local attributes for that entity and any inherited attributes.

1. Form a Schema for Each Entity Set:

schemaattributes
personID, name, address
studentID, name, address, GPA
employeeID, name, address, salary

Drawback: The primary drawback of this approach is data redundancy.

Despite the data redundancy drawback, this approach has the advantage of simplifying queries. You don't need to perform joins to retrieve data specific to a particular entity type, as all the data is available in a single table.

Completeness Constraint

Next, think of a bigger question, is a person necessary to be either a student or an employee? The answer depends on the problem context but in reality, someone like a visiting scholar can not fit into these two categories. This is the Completeness Constaint problem.

It specifies whether an entity in the higher-level entity set must belong to at least one of the lower-level entity sets within a generalization. There are two types of completeness constraints: total and partial.

  1. Total Completeness Constraint:

    • In a total completeness constraint, it is specified that every entity in the higher-level entity set must belong to one of the lower-level entity sets. In other words, there are no entities that exist solely at the higher level; they are always specialized into one of the lower-level entities.

    Example: If we have a generalization hierarchy that combines "Person," "Student," and "Employee," with a total completeness constraint, every person must be either a student or an employee, ensuring that no person exists without one of these specific roles.

  2. Partial Completeness Constraint:

    • In a partial completeness constraint, it is specified that an entity in the higher-level entity set may or may not belong to one of the lower-level entity sets. In this case, there can be entities at the higher level that are not specialized into any of the lower-level entities.

Consider an entity-relationship model that represents vehicles, with a generalization hierarchy involving "Vehicles," "Cars," and "Trucks."

  1. Total Completeness Constraint:

    • Total completeness would mean that every entity in the "Vehicles" higher-level entity set must belong to one of the lower-level entity sets, "Cars" or "Trucks." No vehicle can exist without being classified as either a car or a truck.

      Untitled

      Untitled

  2. Partial Completeness Constraint:

    • Partial completeness allows for the possibility of entities in the higher-level "Vehicles" entity set that are not specialized into "Cars" or "Trucks." Some vehicles, like a bus, can remain unclassified in this case.

      Untitled

Aggregation

Recall that aggregation is often used when you have a complex entity that can be broken down into smaller, related entities. In this case, you're discussing the ternary relationship proj_guide, which involves students, instructors, and projects, and you want to model evaluations of students by instructors on projects using an entity called "evaluation."

Untitled

******************Reduce to Relational Schemas******************

Now how can we take this aggregation into Relational Schemas?

The way we do this is encode the aggregation directly into a schema, because it has the nature of a table.

  1. eval_for Schema:

    • This schema will capture the aggregation of data related to evaluations. It includes attributes from the "evaluation" entity and links them to the primary keys of the associated entity sets.

    • s_ID is the primary key from the "student" entity.

    • project_id is the primary key from the "project" entity.

    • i_ID is the primary key from the "instructor" entity.

    • evaluation_id is the primary key of the "evaluation" entity.

Here's what the "eval_for" schema might look like in SQL-like notation:

Entities vs Attributes

Let's evaluate the two designs in the scenario of dealing with instructor information and phone numbers

Untitled

Attributes Design: Instructor as an entity with attributes:

Entities Design: Instructor and phone as separate entities with a relationship:

Here's a comparison of the two designs:

Attributes Design:

Entities Design:

Entities vs Relationship Sets

In database modeling, the choice between using entity sets and relationship sets depends on how you want to represent the data and the nature of the associations between entities.

Untitled

Here's an analysis of the example you provided, which involves the "section," "registration," and "student" entities:

Entity Sets Approach:

In this approach, you represent both students and sections as entities, and you use a relationship to associate them:

In this design:

Relationship Sets Approach:

In this approach, you represent the "registration" as an entity itself to describe the relationship:

In this design:

Which Approach to Choose:

Consider your specific modeling requirements and whether you want to capture additional information about the registration process when deciding between these two approaches.

Non-Binary to Binary

No offense 🌈 but converting non-binary relationships to binary form is a common practice in database design when you want to represent relationships involving more than two entities using binary relationships. This approach simplifies data modeling and maintains the integrity and clarity of the relationships. Here's how it's done:

Untitled

  1. Create a Middle Artificial Entity Set:

    • To represent the non-binary relationship, you introduce an artificial entity set (let's call it "E"). This artificial entity set will serve as a connector between the original entities involved in the non-binary relationship.

  2. Create Binary Relationship Sets:

    • For each original entity set involved in the non-binary relationship (let's call them A, B, and C), create a binary relationship set between "E" and the respective entity set. This effectively converts the non-binary relationship into several binary relationships.

  3. Assign Entities to Binary Relationships:

    • For each tuple (a, b, c) in the non-binary relationship (let's call it "R"), you create a new entity (let's call it "e") in the artificial entity set "E." Then, you establish relationships between "e" and each of the original entities (a, b, and c) using the binary relationship sets created in step 2.